Figure 1: Our concept of a tangible volume consists of a fully portable and self-contained device, entirely covered with screens. A virtual scene can 
be seen “through” the volume of the device (left image). This volume can be directly positioned within the virtual scene (middle image), and used 
to grasp and manipulate virtual objects (right image). A cubic-shaped device is shown here for illustration, but other volume shapes can be used. 


Abstract 

We present a new mixed reality approach to achieve tangible object 
manipulation with a single, fully portable, and self-contained device. 
Our solution is based on the concept of a “tangible volume.” We turn 
a tangible object into a handheld fish-tank display. Our approach, 
however, goes beyond traditional fish-tank VR in that it can be 
viewed from all sides, and that the tangible volume represents a 
volume of space that can be freely manipulated within a virtual 
scene. This volume can be positioned onto virtual objects to directly 
grasp them and to manipulate them in 3D space. We investigate this 
concept with a user study to evaluate the intuitiveness of using a 
tangible volume for grasping and manipulating virtual objects. The 
results show that a majority of participants spontaneously understood 
the idea of grasping a virtual object “through” the tangible volume. 

Keywords: Tangible user interface, 3D manipulation, mixed reality, 
fish-tank VR 

Index Terms: H.5.1 [Information Interfaces And Presentation]: 
Multimedia Information Systems—Artificial, augmented, and virtual 
realities 

1 Introduction 

Interaction with 3D data and, in particular, object manipulation [2] 
(selecting, translating and rotating 3D virtual objects) is of major 
importance in many fields, such as scientific visualization, prototyping,
and gaming. Until recently, these tasks were often carried out
by advanced users on fixed workstations. With the development of 
mobile computing, however, people are now expecting to perform 
these tasks anywhere, with minimal set-up and learning. There is 
thus a need for an interface that is not only natural and efficient for 
3D manipulation, but also truly portable and self-contained. 

Tangible User Interfaces (TUIs) represent a promising approach 
for 3D manipulation. They consist of physical objects, or tangible
objects, that serve as real-world representations of digital data.
In many TUIs, tangible objects are used as physical “handles” for
virtual objects. These interfaces take advantage of the user’s skills 
in interacting with physical objects [9], making them an attractive 
solution to support manipulation tasks in a natural and efficient way. 

However, interacting with virtual objects also requires visual 
feedback. Tangible UIs for 3D manipulation often rely on an external 
and fixed monitor for visual output [6]. Hence, such interfaces cannot 
be considered portable. Others use mobile devices as a portable 
display surface [11]. Although these interfaces are portable, they still 
consist of multiple separate pieces, which always have to be handled 
and carried together—they are not self-contained. Some TUIs take a 
further step and use the mobile display as a physical handle [5]. The 
device itself replaces the external tangible objects, thereby reducing 
the entire interface to a single portable and self-contained object. 
This solution, however, does not eliminate the distance between 
the display and the virtual objects. Since the mobile display now 
serves as the tangible handle, this separation creates problems during 
manipulation, such as a shifted center of rotation or manipulated 
objects leaving the field of view. 

In this paper, we examine a different approach. Rather than 
turning a mobile device into a tangible handle, we propose to turn a 
tangible object into a display. We present a first, partially simulated 
prototype in which the surface of a tangible object appears to be 
covered with multiple screens. We then use fish-tank rendering to 
display part of the virtual scene “through” the object. In contrast to 
object-oriented or geometric displays [12, 17], our tangible object 
becomes a tangible representation of a volume of virtual space. We 
introduce a “grasping” metaphor that consists in positioning this 
volume directly onto a virtual object, and pressing the fingers to pick 
up the object with the tangible volume. Thus, the separation between 
the tangible handle and the manipulated virtual object disappears. 
By solving the issues caused by this separation, our solution makes 
it possible to preserve the advantages of tangible manipulation in a 
fully portable and self-contained device. 

2 Background 

2.1 Tangible user interfaces for 3D manipulation 

The main idea behind Tangible User Interfaces (TUIs) is to give physical
form to digital data [9,10]. This is accomplished through the use
of real-world objects, called tangible objects, that represent the
digital information. One particular type of TUI is the Graspable User
Interface [4], in which tangible objects serve as physical “handles”
for virtual objects. Each handle can be attached to a virtual object. 
Users can then manipulate a virtual object by directly moving the 
corresponding tangible object. By taking advantage of the user’s 
preexisting skills in manipulating real-world objects [4, 9], this interaction
mode provides a natural and immediately efficient approach
to 3D manipulation. 

A number of TUIs have been designed around this concept. For 
instance, Hinckley et al. [6] used passive tangible objects (“props”) 
tracked in mid-air to position and orient 3D medical data. The result
was displayed on a separate computer screen. A limitation of these
interfaces is, however, that they all require an external monitor or
projector. This makes them essentially fixed installations. The lack
of portability creates many constraints. Users need to go to a dedicated
place in order to use the system. It cannot be carried between
offices or brought home. Nowadays, with the rise of mobile computing,
such constraints appear even more limiting. There is a need
for a truly portable interface that would offer the same advantages 
as fixed TUIs. Our first requirement is thus that the entire interface 
should be portable. 

Since mobile devices, such as smartphones and touchscreen tablets,
have become widespread, they represent a readily available solution
for use as a portable display surface. Some researchers have
combined mobile devices with tangible objects [11], resulting in
portable TUIs. However, these interfaces also consist of multiple
independent pieces: the mobile device and the tangible objects, all of
which need to be carried together at all times. Whenever one of the
tangible objects is missing, or when the mobile device is separated
from the tangible objects, the interface becomes less functional or
even unusable. The portability advantage is reduced due to the inconvenience
of having to carry and keep the multiple pieces together. We
can thus identify another requirement: the interface should not only 
be portable but should also consist of a single, self-contained device. 

2.2 Mobile device as tangible handle 

In addition to its role as a portable display surface, the mobile device 
itself can be seen as a tangible object. Several researchers have 
proposed to use the mobile device as a tangible handle [5, 14] to manipulate
virtual objects in the space behind the device (generally
in augmented reality). This eliminates the need for external tangible
props, resulting in an interface that is fully portable and
self-contained in a single object: the mobile device.

Although this solution fully meets the portability requirements, 
it is not, however, without drawbacks. Unlike previous approaches, 
there is now a separation between the virtual objects (behind the 
device) and the tangible handle (the device itself). This separation 
leads to problems during manipulation. In the technique proposed 
by Henrysson et al. [5], the manipulated object is fixed relative to 
the device. Since the center of rotation is now located on the device, 
it becomes difficult to rotate a virtual object without also translating 
it (Figure 2(a)). In the HOMER-S technique [14], translations and 
rotations are separately applied to the manipulated object, which 
makes it easier to rotate the object about itself. On the other hand, the 
object is no longer fixed relative to the mobile device and can thus 
leave the field of view during rotations (Figure 2(b)). The authors 
suggest altering the control-display ratio to avoid this limitation, but
such an indirect mapping would become less natural than direct 
tangible manipulation. 

One way to completely eliminate the separation and its associated 
problems would be to position the mobile device onto the virtual 
object before starting the manipulation. However, a typical rendering 
process would clip half of the virtual object in this situation—as 
well as everything else in front of the screen—making it difficult 
to manipulate the object properly (Figure 2(c)). In addition, most
current mobile devices come in the form of flat one-sided displays.
Even if clipping were not an issue, this form factor would still result
in loss of visual feedback during manipulation. Therefore, we can
identify two additional requirements: the portable and self-contained
device should provide visual feedback on its entire surface, and
should be able to display virtual objects without clipping.

Figure 2: Manipulation issues that arise when a typical mobile device
is used as a tangible handle: (a) the virtual object does not rotate
about its own center; (b) the virtual object leaves the field of view;
(c) when trying to reduce the distance to avoid the above issues, the
virtual object is clipped.

2.3 Fish-tank VR and geometric displays 

When rendering a 3D scene, the display is generally considered 
as the viewpoint on the virtual world. This causes clipping whenever
a virtual object crosses the screen plane. Fish-Tank Virtual
Reality (FTVR) provides a solution to address the clipping problem. 
FTVR turns the display into a “window” seen from the viewpoint of 
the user [20]. This allows virtual objects to appear behind, onto, or in 
front of the display surface. There are two main ways to achieve this 
effect: stereo rendering, head-coupled perspective, or a combination 
of both [20]. According to Ware et al. [20], head coupling is a much 
stronger cue than stereo rendering, making it the preferred solution
to implement this technique. 
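
As a rough illustration of head-coupled perspective, the sketch below
computes an off-axis viewing frustum for one flat screen from the tracked
eye position, using a common formulation of this projection. It is not
taken from the prototype described in this paper: the function name, the
corner-based screen description, and the coordinate conventions are all
assumptions made for the example.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    length = math.sqrt(dot(a, a))
    return tuple(x / length for x in a)


def off_axis_frustum(lower_left, lower_right, upper_left, eye, near):
    """Return (left, right, bottom, top) frustum extents at the near plane.

    The screen is given by three of its corners and the eye by its tracked
    position, all in the same (hypothetical) world coordinate frame.
    """
    right_axis = normalize(sub(lower_right, lower_left))
    up_axis = normalize(sub(upper_left, lower_left))
    normal = normalize(cross(right_axis, up_axis))  # points toward the viewer

    to_ll = sub(lower_left, eye)
    to_lr = sub(lower_right, eye)
    to_ul = sub(upper_left, eye)

    distance = -dot(to_ll, normal)  # eye-to-screen-plane distance
    if distance <= 0:
        raise ValueError("the eye must be in front of the screen")

    scale = near / distance
    return (dot(right_axis, to_ll) * scale,
            dot(right_axis, to_lr) * scale,
            dot(up_axis, to_ll) * scale,
            dot(up_axis, to_ul) * scale)
```

When the eye is centered in front of the screen, the extents are symmetric;
as the head moves sideways, the frustum becomes skewed while the screen
rectangle stays fixed, which is what produces the “window” effect.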

As explained before, a flat one-sided display is not the best shape 
for use as a tangible handle. A number of works have proposed 
volumetric objects equipped with multiple displays, capable of providing
visual feedback on their entire surface. Some of these objects
only support 2D rendering, such as the “Display Blocks” by Pla
and Maes [16]. Others take advantage of FTVR to give the illusion
of a 3D space inside the device. They are known as geometric displays [17],
or object-oriented displays [12]. One of the first examples
is the MEDIA cube [8]: multiple LCD displays were arranged in a 
box shape, and combined with head coupling to give the illusion that 
a virtual scene was inside the box. The CoCube [3] is a tangible cube 
that produced the same illusion when seen through a head-mounted 
display (HMD). Unlike the previous example, this tangible cube 
was freely manipulable by the user. The Cubee [18] is a cubic device 
that achieved the same goal without an HMD, by using integrated
displays and a head tracker. The pCubee [17, 19] is an evolution 
of Cubee that was made smaller and more portable. Most of the 
above devices have a cuboid shape because this is the easiest way 
to arrange conventional rectangular displays, but other shapes are 
possible (e.g., arbitrary polyhedra or spheres).

Geometric displays appear to meet all our requirements: they 
can be made small, portable, and self-contained, they provide visual 
feedback on their entire surface, and with FTVR they can display virtual
objects without clipping. However, previous work on this subject
was mainly focused on the feasibility of such displays. Although
significant technological contributions have been made, the potential 
of such devices for 3D interaction remains comparatively unknown. 
In addition, there seems to be a lack of examples of positioning the 
geometric display itself within a larger virtual scene, which is an 
essential part of our concept. Even though the pCubee can be used 
to navigate in the virtual scene, this is accomplished through an 
indirect velocity-based mapping. In contrast, our concept is based 
on a full one-to-one mapping between the real world and the virtual world,
which has greater potential for direct 3D interaction. In this work, 
we offer a new perspective on geometric displays, by considering 
such a device as a portable and self-contained TUI in its own right, 
capable of both displaying a virtual scene and serving as a tangible 
handle within the scene itself. 

3 Concept: a tangible volume 

Based on the above considerations, we introduce the concept of a 
“tangible volume.” A tangible volume is a single physical object, 
sufficiently small and lightweight to be held in the hand. The surface 
of this object is entirely covered in screens on which the virtual scene 
is displayed. The perspective of each screen is adjusted to the user’s 
head position. As a result, part of the virtual scene appears “through” 
the object. The object is also tracked relative to the real world. Users 
can reach other parts of the virtual scene by moving the object in real 
space (Figure 3). At any point, the physical boundaries of the object 
“enclose” a corresponding part of the virtual scene. Therefore, this 
object is a tangible representation of a volume of virtual space that 
can be held in the hand and directly positioned into the virtual scene. 

We use this tangible volume as an interaction device for 3D object 
manipulation. First, the tangible volume is positioned onto a virtual 
object in the scene. From there, the virtual object is attached to 
the volume. The virtual object then follows the tangible volume in 
3D space as if it were directly held in the hand (Figure 1). Thus, there
is no separation between the virtual object and the tangible handle. 
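
One way to express this attachment is with rigid transforms: at grasp
time the object pose is stored relative to the volume, and while the
grasp lasts the object pose is recomputed from the tracked volume pose.
The sketch below assumes poses are 4x4 homogeneous matrices; the helper
names (`pose`, `grasp`, `follow`) are hypothetical and do not come from
the prototype.

```python
import math


def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(m):
    """Invert a rigid transform (rotation + translation)."""
    rot_t = [[m[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(rot_t[i][k] * m[k][3] for k in range(3)) for i in range(3)]
    return [rot_t[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def pose(yaw, position):
    """Hypothetical helper: rotation about the vertical (y) axis plus a position."""
    c, s = math.cos(yaw), math.sin(yaw)
    x, y, z = position
    return [[c, 0.0, s, x],
            [0.0, 1.0, 0.0, y],
            [-s, 0.0, c, z],
            [0.0, 0.0, 0.0, 1.0]]


def grasp(volume_pose, object_pose):
    """At grasp time, express the object pose in the volume's frame."""
    return matmul(rigid_inverse(volume_pose), object_pose)


def follow(volume_pose, object_in_volume):
    """While grasped, the object keeps its stored offset from the volume."""
    return matmul(volume_pose, object_in_volume)
```

If the object sits at the center of the volume when grasped, rotating the
volume rotates the object about its own center, with no unwanted
translation.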

Our concept integrates all input and output into a single handheld 
object, used both to visualize the virtual scene and to manipulate virtual
objects. Unlike the alternative approach of using a mobile device
as a tangible handle, our interface also eliminates the separation from 
the manipulated virtual objects. Our concept thus constitutes a fully 
portable and self-contained interface for 3D object manipulation 
which avoids the problems described before. 

3.1 Object selection by grasping 

Having an interface made of a single tangible object has clear
benefits for portability, but also raises the question of how to interact
with more than one virtual object. Such an interface must
provide a way to attach and detach virtual objects from the single 
tangible handle. 

In TUIs made of multiple tangible objects, each tangible object 
can be linked to a different virtual object. This allows the user to 
interact easily with multiple virtual objects, simply by manipulating 
the corresponding tangible objects as desired. This is called “space 
multiplexing” [4]. In our case, there is only one tangible object available
to manipulate an arbitrary number of virtual objects. Therefore,
the user has to select which virtual object should be linked to the tangible
object at any given time. This is called “time multiplexing” [4].
While space multiplexing is considered more desirable than time
multiplexing, because the lack of an object selection step improves
efficiency and lowers cognitive load [4], let us consider more
closely what selection means in the specific case of tangible objects. 
Even though some interfaces allow the user to interact with multiple 
tangible objects concurrently, this is ultimately limited by human
capabilities. Tangible manipulation is typically performed with the
hand(s). This means that all manipulation is accomplished through
at most two effectors: the hands themselves. Any tangible object
first has to be grasped with the hand in order to interact with the
corresponding virtual object. Therefore, there is still an implicit
selection in space-multiplexed TUIs that occurs when an object is
grasped with the hand.

Figure 3: Illustration of our concept of a “tangible volume”. Part of a
larger virtual scene can be seen through the tangible object held by the
user. On the top right, the tangible volume is observed from a different
angle. On the bottom right, the tangible volume has been translated
in space and now encloses a different part of the virtual scene.

From that perspective, one of the main advantages of space-multiplexed
TUIs is not actually the lack of a selection step, but rather
the fact that this selection is implicit and does not require thinking. In
other words, reaching for an object and grasping it with the hand 
constitutes a natural way to select it. 

Grasping metaphor 

The tangible volume provides a unique opportunity to reproduce this 
form of selection with virtual objects: by considering the volume as 
an extension of the hand. As shown above, the tangible volume can 
be moved in space and positioned around a virtual object. Since the 
volume is held in the hand, and is surrounding a virtual object, the 
hand is also surrounding the virtual object. 

We thus designed a selection technique that consists in pressing 
the fingers on the tangible volume to “grasp” the virtual object 
located inside (Figure 4). Because the tangible volume is already 
held in the hand, the grasping technique is only triggered when finger 
pressure exceeds a given threshold. This threshold allows users to 
differentiate a conscious grasping action from normal manipulation 
of the tangible volume itself. Similarly, releasing the virtual object is 
done by releasing finger pressure below this threshold, while keeping 
hold of the tangible volume. 
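
The thresholding described above can be sketched as a small state
machine. This is an illustration only: the text does not specify how
readings from several faces are combined, so taking the maximum over
faces is an assumption, and the class name and threshold value are
hypothetical.

```python
class GraspDetector:
    """Turn a stream of pressure samples into grasp/release events."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical, normalized pressure units
        self.grasping = False

    def update(self, face_pressures):
        """Feed one sample (one value per face); return 'grasp', 'release', or None."""
        # Assumption: the strongest reading over all faces drives the decision.
        pressure = max(face_pressures)
        was_grasping = self.grasping
        self.grasping = pressure > self.threshold
        if self.grasping and not was_grasping:
            return "grasp"
        if was_grasping and not self.grasping:
            return "release"
        return None
```

For example, feeding samples whose strongest reading goes 0.2, 0.7, 0.8,
0.3 with a threshold of 0.6 yields no event, then a grasp, then no event,
then a release.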

With our grasping metaphor, the action of picking up a virtual object
becomes very similar to picking up a real-world object: placing
the hand around an object, and pressing the fingers to grasp it. 

3.2 Disambiguating between virtual objects 

A challenge that arises from our grasping metaphor is that fingers 
cannot penetrate the volume. When virtual objects are close to 
each other, more than one may be located inside the volume bounds. 
Since fingers cannot be directly used to grasp the desired object, there 
must be a way to indicate which object will be selected among those 
located in the volume. One possible solution is to display an outline 
around the object that is closest to the center of the volume (Figure 4). 
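
A minimal sketch of this rule follows, under two assumptions that are not
stated in the text: each virtual object is reduced to a single reference
point, and positions are already expressed in the local frame of a cubic
volume whose origin is its center. The function name is hypothetical.

```python
import math


def closest_object_inside(objects, half_size):
    """Return the name of the object to outline, or None.

    objects: dict mapping a name to an (x, y, z) point in the volume's
    local frame. half_size: half the edge length of the cubic volume.
    """
    inside = {name: p for name, p in objects.items()
              if all(abs(c) <= half_size for c in p)}
    if not inside:
        return None
    origin = (0.0, 0.0, 0.0)
    return min(inside, key=lambda name: math.dist(inside[name], origin))
```

Objects outside the volume are never candidates, so the outline
disappears as soon as the volume no longer encloses anything.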

3.3 Bimanual manipulation 

Space-multiplexed TUIs allow the user to manipulate different virtual
objects in each hand. Since our interface is made of a single
object, this form of bimanual manipulation is not directly supported.
One solution could be to use two tangible volumes, one for each
hand. Since our grasping metaphor makes the tangible volume an
extension of the hand, and all manipulation is accomplished through
the hands, this solution would approximate space-multiplexed bimanual
interaction. However, the interface would then again consist
of multiple pieces, which we specifically wanted to avoid.

Figure 4: Illustration of the different steps of object manipulation in our interface concept. First, the tangible volume is positioned onto a virtual 
object. To disambiguate between nearby objects, an outline indicates which object is both inside the volume and closest to its center. This object 
can be grasped by pressing the fingers on the volume. The object is then attached to the volume, and can be directly moved alongside the volume 
in 3D space. The object is detached when finger pressure is released. If virtual gravity is enabled, it then falls to the ground. 

Still, as a single object, our interface supports another form of 
bimanual interaction: manipulating a single virtual object with both 
hands. This can be accomplished by holding the tangible volume 
with two hands, and pressing the respective fingers on opposite sides 
of the volume. 

3.4 Simulated physics 

In his dissertation on Graspable User Interfaces, Fitzmaurice [4], 
citing Norman [15], argues that the natural laws of physics that 
affect tangible objects help the user during manipulation. Indeed, in 
graspable UIs, releasing an object from the hand makes it drop to 
the floor due to gravity. This familiar behavior can help understand 
and predict its motion. 

However, in our interface, the tangible volume always remains 
in the hand. Releasing a virtual object simply detaches the virtual 
object from the volume. Without further intervention, the virtual 
object would remain there—floating in space. We can obtain a more 
realistic behavior by adding simulated physics to the virtual scene. 
With physics, the virtual object falls to the virtual ground
when released (Figure 4). Simulated physics also prevents virtual
objects from moving through each other. This reinforces the illusion 
that they are solid, and thus can be grasped and manipulated directly. 
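
To make the release behavior concrete, the toy sketch below integrates
the height of a released object until it reaches the ground, or leaves
the object in place when gravity is disabled. It only illustrates the
idea: it is not a physics engine, handles no collisions between objects,
and its function name, time step, and units are assumptions.

```python
def release_trajectory(height, gravity_enabled=True, dt=1.0 / 60.0,
                       g=9.81, ground=0.0):
    """Return the successive heights (in meters) of an object after release."""
    heights = [height]
    if not gravity_enabled:
        return heights  # the object stays where it was released
    velocity = 0.0
    while height > ground:
        velocity -= g * dt                            # semi-implicit Euler step
        height = max(ground, height + velocity * dt)  # stop at the ground
        heights.append(height)
    return heights
```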

It should be noted that virtual physics is not always desirable. For 
example, a complex manipulation task may have to be decomposed
into several steps in mid-air (clutching). With virtual gravity, releasing
the virtual object between each step would cause it to drop from its 
intermediate positions. Virtual physics enhances realism, but is not 
necessarily appropriate for every application. 

4 Implementation 

In order to study the usability of our concept, we designed a preliminary
prototype. As noted above, our work primarily focuses on the
interaction capabilities afforded by a tangible volume, rather than 
on the hardware side. Therefore, we used augmented reality (AR) to 
simulate some hardware aspects of our concept. 

One may argue that existing geometric displays could have served 
as a basis to conduct user studies, without requiring AR simulation. 
However, existing implementations described in previous work were 
all lacking in key aspects of our concept. Some of these devices are 
fixed installations, not meant to be moved by the user [8]. Others 
are somewhat movable, but are still tethered to a larger workstation [13, 17].
The tangible volume as intended in our concept is
supposed to be fully portable and self-contained, and a tether wire
could hinder manipulation [7]. Finally, some of these devices are
only partially covered in screens [17, 18]. Even though this is sufficient
to demonstrate the technology, for complex object manipulation
(especially rotations) it is important that all sides of the volume provide
visual feedback. Simulating an interaction device to conduct
user studies has been done in previous work [1]. This approach has 
the advantage of providing flawless rendering and tracking, which 
would be difficult to achieve in a research prototype but essential for 
the validity of user studies. 

We chose a cubic shape for the tangible volume in our prototype. 
Even though other shapes could have been used, we chose a cube 
because it was easier to build while also being easy to manipulate by 
users. This cube was covered with AR markers. We used a touchscreen
tablet as an augmented reality window. When observed through 
the rear camera of the tablet, the faces of the cube were replaced 
with “virtual screens” that displayed part of the virtual scene with 
a fish-tank effect. A frame was added around each virtual screen, 
to account for the fact that real screens would not be completely 
borderless. Using AR provided implicit viewpoint tracking: since the 
cube faces were tracked by the camera, reversing this transformation 
produced an equivalent result. The cube was tracked relative to the 
real world by placing an additional AR marker in the environment. 
We employed the Vuforia framework 1 to track these objects. The 
tablet was attached to a raised stand, so that users could simply 
manipulate the object behind the tablet, as if they were directly 
looking at a cube equipped with screens. 
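
The implicit viewpoint tracking mentioned above amounts to inverting a
rigid transform. The sketch below assumes a marker tracker that reports
the cube pose in the camera frame as a rotation matrix R and a
translation t, such that x_cam = R x_cube + t; the viewpoint in the cube
frame is then -R^T t. The function name is hypothetical and no tracker
API is used.

```python
def camera_position_in_cube_frame(rotation, translation):
    """Return the camera position expressed in the cube's frame.

    rotation: 3x3 matrix (nested sequences), translation: 3-vector, both
    mapping cube coordinates to camera coordinates.
    """
    return tuple(-sum(rotation[k][i] * translation[k] for k in range(3))
                 for i in range(3))
```

This position can then serve as the eye position for the fish-tank
rendering of each virtual screen.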

Our grasping technique was implemented in actual hardware, by 
attaching six flat pressure sensors (Interlink® FSR 406) to the faces 
of the tangible cube. The sensors were located under the AR markers,
and thus were invisible to the users. They were driven by a
microcontroller 2 embedded in the cube (Figure 5). It was powered
by a rechargeable battery, with a charging port hidden in one corner
of the cube. The microcontroller continuously transmitted the
pressure values to the rendering software on the tablet, through a 
wireless Bluetooth connection, at a frequency of 10 Hz. Finally, 
we implemented physical simulation for the grasped objects with 
the Bullet 3 physics engine. 

In the rest of this paper, we use this prototype to investigate the 
intuitiveness of object manipulation with our grasping metaphor. 


1 http://www.vuforia.com/ 
2 RFduino RFD22102 
3 http://www.bulletphysics.org/ 





























Figure 5: View of the electronic components inside the cube. The 
embedded microcontroller retrieves values from the pressure sensors 
on the cube surface, and sends them to the rendering software on the 
tablet through a wireless connection. 


5 User study: object manipulation 

When interacting with the real world, grasping an object by pressing 
fingers on it, and moving it while maintaining finger pressure is a 
fairly natural procedure. However, it is unknown whether the exact 
same procedure remains natural when interacting with virtual objects 
through a tangible volume. We thus conducted an experiment to 
evaluate the intuitiveness of object selection and manipulation in 
our interface. 

More specifically, we wanted to see whether users could understand by
themselves, with no prior explanation, how to grasp and move virtual
objects with our interface. Of course, if some users do not succeed
without help, this measure alone would not be sufficient to understand
why. Therefore, we also wanted to determine at which step
of manipulation those users would need guidance on their initial 
encounter with the interface. 

5.1 Participants and apparatus 

This study was conducted with 36 unpaid participants (from 20 
to 52 years old, mean=29.5, sd=9.4). None of them had any prior 
knowledge of our interface concept or our grasping metaphor. The 
apparatus was our AR-based prototype described in the previous 
sections. We slightly raised the pressure threshold to ensure that the 
grasping technique would only be triggered when truly intended by 
the participants. 

5.2 Procedure 

Before starting the experiment, we first told participants that it would 
consist in “manipulating the cube”. We then explained why there 
was also a tablet in front of them: to “simulate screens that should 
have been on the faces of the cube, but were not there due to technical 
limitations”. We followed by demonstrating how the cube turned 
into a cubic display when placed behind the tablet, and how a virtual 
scene could be seen through it. If our prototype had actual screens 
and head tracking, participants would likely have noticed all of this 
immediately. We did this short demonstration to ensure there was 
no confusion about our AR simulation. However, we gave them no 
explanation about how to interact with virtual objects. 

We then introduced participants to the task: moving a virtual 
apple to a nearby target. The target was represented by a circle on 
the virtual floor (Figure 6), 6 cm to the right of the virtual object. 
Participants were told this task would have to be done “with the 
cube”. They were also specifically told to ignore the tablet for 
manipulation. Finally, we asked them to discover how to do that “by 
themselves, as far as possible”. 

Figure 6: Screenshot of our first experiment, as seen by participants
through the tablet. Note the lack of finger occlusion, due to the AR simulation.
The task was to pick up the virtual apple and move it to the
red circle on the right.

We expected that some participants would not be able
to complete the task without explanation. In order to understand how
much help these participants needed, we designed a set of textual
hints of increasing specificity. During the task, there was a
button that could be pressed to reveal a new hint. Each press on
the button uncovered an additional hint on the tablet’s screen, in the
following order:

1. “Put the cube onto the apple” 

2. “Press the cube to grab the apple” 

3. “Move the cube while maintaining the pressure” 

These hints were specifically chosen to cover the different steps of 
object manipulation in our interface. Additionally, we chose to use 
textual hints rather than visual representations (e.g. arrows) to avoid 
ambiguous interpretations. We explained the role of the hint button 
to participants, and strongly encouraged them to use as few hints 
as possible. 

5.3 Results and discussion 

Figure 7 shows the percentage of participants according to the number
of hints they needed to complete the task. A total of 19 participants
(53%) successfully completed the task without requesting
any hint. Among the remaining participants, all but one (45%) succeeded
with the first two hints. The third hint was never used. We
also report the task completion times for the participants who accomplished
the task without any hint. The mean completion time
was 63.5 s (SD=34.3 s). 

More than half of participants discovered by themselves how 
to grasp and manipulate virtual objects with our tangible volume. 
This is an encouraging result, given that our interface is so
different from the way most people currently interact with virtual
3D objects, and more generally interact with computers. At first 
glance, the completion times—about one minute on average—might 
seem long. However, this was the time needed to discover how to use 
a completely unfamiliar device and accomplish a manipulation task 
for the first time. Thus, it seems that the idea of grasping a virtual
object “through” a tangible volume was spontaneously considered 
by a majority of participants, and within a reasonable time. 

For those who did not complete the task without help, the number 
of hints requested provides more insight into which parts of manipulation
were troublesome. The first hint was designed to uncover
potential difficulties in positioning the tangible volume onto the
virtual object. Nearly all participants who requested hints were not
helped by this first hint. Hence, positioning was likely not what
prevented them from completing the task. Indeed, all participants
quickly noticed that an outline appeared around the virtual object
whenever it was inside the volume. Some of them, however, believed
that the outline meant the object was already selected, and
attempted to move it without pressing the fingers. 

Figure 7: Number of hints and time needed by participants to discover
how to manipulate a virtual object with our interface.

Among other attempted strategies, a surprising number of participants
tried to “push” the virtual object with the tangible volume.
This may indicate they thought the sides of the volume would be
solid in the virtual scene. Many tried to tap or flick their fingers 
onto the surface of the volume. In some cases, it was apparently 
an attempt to replicate the “click” metaphor, especially when they 
grasped the tangible volume as if it were a mouse and tried to click 
(and even double-click) on the top. In other cases, it was clearly an 
attempt to affect the virtual object through the volume, especially 
when flicking a finger against one of the side faces. Finally, some 
participants tried to fully enclose the cube in their hands. This may 
have been an attempt to grasp the virtual object. However, in doing 
so they lost visual feedback, and it was thus impossible to see if 
finger pressure was sufficient to trigger the grasping technique. 

Nearly all participants who requested the first hint also requested 
the second hint to complete the task. The second hint was designed to 
uncover difficulties with the grasping metaphor itself. No participant 
ever asked for the third hint. For those who needed help, the grasping 
metaphor was therefore the main hurdle. The third hint was about 
moving the selected object by keeping it attached to the tangible 
volume. Since no participant requested this hint, none of them 
encountered any problem with this last step. 

6 Conclusion 

We introduced the concept of a “tangible volume”: a fully portable 
and self-contained device for 3D interaction, made of a single tangible
object entirely covered with screens. In contrast to existing
object-oriented or geometric displays, this tangible object represents 
a volume of the virtual scene and can be positioned directly onto 
virtual objects. This makes 3D interaction more direct than with previous
approaches. We described an object manipulation technique
that consists in grasping virtual objects “through” the volume and 
moving them in 3D space. We created a partial prototype based on 
this concept, and used this prototype to investigate some aspects of 
its usability. In particular, we showed that the grasping metaphor 
was spontaneously understood by a majority of users. 

Future work should focus on two aspects. On the technical side, 
the first step would be to create a full implementation of our concept. 
This would require several improvements to the current technology.
In particular, achieving reliable head tracking and environment
tracking in a fully portable and self-contained device remains an
important challenge, which has not been addressed so far by existing
prototypes (Section 2.3). While recent devices with multiple
built-in sensors, such as the Tango tablet, improve the situation, they
still have issues with 3D tracking, and we expect that future technological
developments will provide better location data. On the
experimental side, a more complete implementation would make it possible to
repeat user studies in conditions that more closely resemble a real
device. For example, the influence of correct finger occlusion could 
be studied in a prototype with actual screens, as well as the effect of 
possible imperfections in tracking and rendering. 